Covariance Regularization for Supervised Learning in High Dimensions
Abstract
This paper studies the effect of covariance regularization on the classification of high-dimensional data. This is done by fitting a mixture of Gaussians with a regularized covariance matrix to each class. Three data sets from different domains are chosen to suggest that the results apply broadly to high-dimensional data. The regularization needs of the data are also compared when it is pre-processed with two dimensionality reduction techniques: principal component analysis (PCA) and random projection. The observations include that using a large amount of covariance regularization consistently provides classification accuracy as good as, if not better than, using little or no covariance regularization. The results also indicate that random projection complements covariance regularization.
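The abstract does not specify the exact form of the regularizer; a common choice, sketched here under that assumption, shrinks each class's sample covariance toward a scaled identity before plugging it into a per-class Gaussian model (a one-component mixture, for brevity):

```python
import numpy as np

def shrink_covariance(X, alpha):
    """Shrink the sample covariance toward a scaled identity.

    alpha=0 gives the empirical covariance; alpha=1 gives
    (trace(S)/d) * I, a heavily regularized estimate.
    """
    S = np.cov(X, rowvar=False)
    d = S.shape[0]
    target = (np.trace(S) / d) * np.eye(d)
    return (1.0 - alpha) * S + alpha * target

def fit_gaussian_classifier(X, y, alpha):
    """Fit one regularized Gaussian per class (a 1-component mixture)."""
    models = {}
    for c in np.unique(y):
        Xc = X[y == c]
        models[c] = (Xc.mean(axis=0), shrink_covariance(Xc, alpha))
    return models

def predict(models, X):
    """Assign each row of X to the class with the highest Gaussian log-density."""
    classes = sorted(models)
    scores = []
    for c in classes:
        mu, cov = models[c]
        diff = X - mu
        _, logdet = np.linalg.slogdet(cov)
        inv = np.linalg.inv(cov)
        # Mahalanobis distance of each row under this class's Gaussian.
        mahal = np.einsum('ij,jk,ik->i', diff, inv, diff)
        scores.append(-0.5 * (logdet + mahal))
    return np.array(classes)[np.argmax(np.stack(scores, axis=1), axis=1)]
```

With high-dimensional data and few samples per class, the empirical covariance is singular; the shrinkage term keeps it invertible, which is the practical motivation for heavy regularization reported in the abstract.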
Similar papers
Classification by semi-supervised discriminative regularization
Linear discriminant analysis (LDA) is a well-known dimensionality reduction method which can be easily extended for data classification. Traditional LDA aims to preserve the separability of different classes and the compactness of the same class in the output space by maximizing the between-class covariance and simultaneously minimizing the within-class covariance. However, the performance of L...
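The snippet above describes the classical, fully supervised LDA objective; a minimal NumPy sketch of that objective (not of the semi-supervised extension the paper proposes) is:

```python
import numpy as np

def lda_directions(X, y, n_components):
    """Find directions maximizing between-class scatter relative to
    within-class scatter (the classical Fisher/LDA criterion)."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mu, mc - mu)
    # Solve the generalized eigenproblem Sb v = lambda Sw v via Sw^{-1} Sb.
    evals, evecs = np.linalg.eig(np.linalg.inv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1]
    return evecs.real[:, order[:n_components]]
```

Note that inverting `Sw` already assumes it is non-singular, which again fails in high dimensions without regularization, tying this snippet back to the main paper's theme.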
CS 545: Assignment 8, Dan
Mixture of Probabilistic Principal Component Analyzers (MPPCA) is a seminal work in machine learning in that it was the first to use PCA to perform clustering and local dimensionality reduction. MPPCA is based upon the Mixture of Factor Analyzers (MFA), which is similar to MPPCA except that it uses Factor Analysis to estimate the covariance matrix. This algorithm is of interest to me because it is re...
Learning with Limited Supervision by Input and Output Coding
In many real-world applications of supervised learning, only a limited number of labeled examples are available because the cost of obtaining high-quality examples is high or the prediction task is very specific. Even with a relatively large number of labeled examples, the learning problem may still suffer from limited supervision as the dimensionality of the input space or the complexity of th...
Semi-supervised Learning by Higher Order Regularization
In semi-supervised learning, at the limit of infinite unlabeled points while fixing labeled ones, the solutions of several graph Laplacian regularization based algorithms were shown by Nadler et al. (2009) to degenerate to constant functions with "spikes" at labeled points in ℝ^d for d ≥ 2. These optimization problems all use the graph Laplacian regularizer as a common penalty term. In this paper...
Learning convolution filters for inverse covariance estimation of neural network connectivity
We consider the problem of inferring direct neural network connections from calcium imaging time series. Inverse covariance estimation has proven to be a fast and accurate method for learning macro- and micro-scale network connectivity in the brain, and in a recent Kaggle Connectomics competition inverse covariance was the main component of several top-ten solutions, including our own and the winn...
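The excerpt does not include the authors' estimator; a bare-bones illustration of the underlying idea, using simple shrinkage rather than any competition-grade method, thresholds the partial correlations derived from the precision (inverse covariance) matrix. The `alpha` and `threshold` parameters here are illustrative, not from the paper:

```python
import numpy as np

def infer_connections(series, alpha=0.1, threshold=0.05):
    """Infer an undirected connectivity graph from time series.

    series: array of shape (n_timepoints, n_units). A shrinkage term
    alpha keeps the covariance invertible; edges are kept where the
    absolute partial correlation exceeds the threshold.
    """
    S = np.cov(series, rowvar=False)
    d = S.shape[0]
    S_reg = (1.0 - alpha) * S + alpha * (np.trace(S) / d) * np.eye(d)
    P = np.linalg.inv(S_reg)  # precision matrix
    # Partial correlation between units i and j: -P_ij / sqrt(P_ii * P_jj).
    norm = np.sqrt(np.outer(np.diag(P), np.diag(P)))
    partial = -P / norm
    np.fill_diagonal(partial, 0.0)
    return np.abs(partial) > threshold
```

Unlike raw correlation, the precision matrix zeroes out indirect dependencies, which is why inverse covariance is attractive for distinguishing direct from mediated connections.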
Publication year: 2010